How Well Does Your Instance Matching System Perform? Experimental Evaluation with LANCE

Authors

  • Tzanina Saveta
  • Evangelia Daskalaki
  • Giorgos Flouris
  • Irini Fundulaki
  • Axel-Cyrille Ngonga Ngomo
Abstract

Identifying duplicate instances in the Data Web is most commonly performed (semi-)automatically using instance matching frameworks. However, current instance matching benchmarks fail to give end users and developers the insight they need into how current frameworks behave when dealing with real data. In this paper, we present the results of evaluating instance matching systems using Lance, a domain-independent, schema-agnostic instance matching benchmark generator for Linked Data. Lance is the first benchmark generator for Linked Data to support semantics-aware test cases that take into account complex OWL constructs, in addition to the standard test cases related to structure and value transformations. We provide a comparative analysis with benchmarks produced using the Lance framework for different domains to assess and identify the capabilities of state-of-the-art instance matching systems.

Similar resources

LANCE: Piercing to the Heart of Instance Matching Tools

One of the main challenges in the Data Web is the identification of instances that refer to the same real-world entity. Choosing the right framework for this purpose remains tedious, as current instance matching benchmarks fail to provide end users and developers with the necessary insights pertaining to how current frameworks behave when dealing with real data. In this paper, we present Lance,...

LANCE: A Generic Benchmark Generator for Linked Data

Identifying duplicate instances in the Data Web is most commonly performed (semi-)automatically using instance matching frameworks. However, current instance matching benchmarks fail to provide end users and developers with the necessary insights pertaining to how current frameworks behave when dealing with real data. In this demo paper, we present Lance, a domain-independent instance matching ...

Pushing the Limits of Instance Matching Systems: A Semantics-Aware Benchmark for Linked Data

The architectural choices behind the Data Web have led to the publication of large interrelated data sets that contain different descriptions for the same real-world objects. Due to the mere size of current online datasets, such duplicate instances are most commonly detected (semi-)automatically using instance matching frameworks. Choosing the right framework for this purpose remains tedious, a...

An Approach for Automatic Matching of Descriptive Addresses

Address matching (also called geocoding) is an applied spatial analysis which is frequently used in everyday life. Almost all desktop and web-based GIS environments are equipped with a module to match the addresses expressed in pre-defined standard formats on the map. It is an essential prerequisite for many of the functionalities provided by location-based services (e.g. car navigation). Sever...

Query by Humming: How good can it get?

When explaining the Query-by-humming (QBH) task, it is typical to describe it in terms of a musical question posed to a human expert, such as a music-store clerk. An evaluation of human performance on the task can shed light on how well one can reasonably expect an automated QBH system to perform. This paper describes a simple example experiment comparing three QBH systems to three human listen...

Publication date: 2016